Provably Efficient Neural GTD for Off-Policy Learning
This paper studies a gradient temporal difference (GTD) algorithm using neural network (NN) function approximators to minimize the mean squared Bellman error (MSBE). For off-policy learning, we show that the minimum MSBE problem can be recast into a min-max optimization involving a pair of over-parameterized primal-dual NNs. The resultant formulation can then be tackled using a neural GTD algorithm. We analyze the convergence of the proposed algorithm with a 2-layer ReLU NN architecture using $m$ neurons and prove that it computes an approximate optimal solution to the minimum MSBE problem as $m \rightarrow \infty$.
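As an illustrative sketch only (the paper's exact architecture, step sizes, and updates are not reproduced here), the primal-dual idea can be mimicked with two 2-layer ReLU networks of width $m$: a primal value network and a dual network that tracks the TD error, updated in a GTD-style saddle-point fashion. All names below (`forward`, `gtd_step`, the step sizes `eta_p`, `eta_d`) are hypothetical, and, as in common NTK-style analyses, only the first-layer weights are trained.

```python
import numpy as np

# Hedged sketch, NOT the paper's algorithm: a primal-dual GTD-style update for
# off-policy evaluation with two 2-layer ReLU networks of width m.
rng = np.random.default_rng(0)

m, d = 64, 4       # hidden width m, state-feature dimension d
gamma = 0.9        # discount factor

def init_net():
    # 2-layer ReLU net f(s) = (1/sqrt(m)) * sum_r b_r * relu(w_r . s);
    # the output weights b_r are fixed random signs (only W is trained).
    return {"W": rng.normal(size=(m, d)), "b": rng.choice([-1.0, 1.0], size=m)}

def forward(net, s):
    h = np.maximum(net["W"] @ s, 0.0)          # ReLU hidden layer
    return net["b"] @ h / np.sqrt(m)

def grad_W(net, s):
    # Gradient of the scalar output w.r.t. the first-layer weights W.
    act = (net["W"] @ s > 0).astype(float)     # ReLU derivative
    return (net["b"] * act)[:, None] * s[None, :] / np.sqrt(m)

primal = init_net()   # value network (minimizes the squared Bellman error)
dual = init_net()     # dual network (tracks the TD error, ascends)

eta_p, eta_d = 0.05, 0.1   # two-timescale step sizes (dual faster)

def gtd_step(s, r, s_next):
    """One saddle-point update on a transition (s, r, s')."""
    delta = r + gamma * forward(primal, s_next) - forward(primal, s)  # TD error
    w = forward(dual, s)
    # Dual ascent: pull the dual output toward the TD error.
    dual["W"] += eta_d * (delta - w) * grad_W(dual, s)
    # Primal descent along the gradient of delta, weighted by the dual output.
    primal["W"] -= eta_p * w * (gamma * grad_W(primal, s_next) - grad_W(primal, s))
    return delta

# Fit a single synthetic transition: the TD error should shrink.
s, s_next = rng.normal(size=d), rng.normal(size=d)
d0 = abs(gtd_step(s, 1.0, s_next))
for _ in range(500):
    d_last = abs(gtd_step(s, 1.0, s_next))
```

The dual step size being larger than the primal one reflects the usual two-timescale structure of GTD methods; on this toy single-transition problem the TD error decays essentially geometrically once the dual network tracks it.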
Review for NeurIPS paper: Provably Efficient Neural GTD for Off-Policy Learning
Weaknesses: The philosophy of establishing convergence guarantees for neural networks in terms of a specific number of neurons is strange, because the number of neurons is a very coarse description of a network, one that can already be matched by nonparametric estimators (e.g., Cho, Youngmin, and Lawrence K. Saul, and numerous follow-up works). Therefore, if the neural network analysis is to refine this approach, it must also specify the *inter-layer* relationships and broader architectural choices to actually be useful to practitioners. As is, I don't see how the $m$ of Lemma 4.1 can be used to inform the choice of a neural architecture in any sharper manner than, e.g., a single-layer RBF network. Also, reformulating Bellman's equations as saddle-point problems has been previously studied: Shapiro, A. (2011).
Meta-review for NeurIPS paper: Provably Efficient Neural GTD for Off-Policy Learning
This paper generated substantial discussion among the reviewers. Reviewer 1's points about the lack of contextualization are well taken by the other reviewers. That said, the meta-reviewer (in consultation with the Senior Area Chair) agrees that the theoretical contribution will be of interest to the NeurIPS community, and the clarity and sharpness of the authors' response suggests the authors are quite capable of revising the paper to more clearly discuss context and articulate their contribution. As such, the meta-reviewer recommends acceptance.